The Listening Machine: 1st Annual Report
Author
Abstract
In this first year of the project, our work focused on the problem of identifying and separating specific sound sources in mixtures. The core of our approach is to use prior knowledge about the sounds in the world, encapsulated in a model, to provide the constraints needed to solve the otherwise ill-posed blind separation problem. We have applied this approach to the reverberant multi-microphone case. In collaboration with Bhiksha Raj of MERL in Cambridge, we set the parameters of a filter-and-sum beamformer by gradient descent on the match between the separated signals and a constrained speech approximation, which is built from the model means of the states along the best-match path found by a speech recognizer [9]. The beamformer parameters and the recognizer state path can be re-estimated alternately; we found this process to converge after a few cycles. When only a single voice is present, the process amounts to blind estimation of a dereverberation filter. We were more interested, however, in the problem of multiple overlapping voices, which requires two initial speech recognizer state paths. This required a factorial-HMM model for the
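The beamformer fitting step described in the abstract can be illustrated with a minimal sketch. It is not the report's implementation. It assumes a time-domain squared-error objective and a fixed target signal; in the report the target is instead rebuilt from the recognizer's model means on each cycle, a step omitted here. All function names and parameter values (`taps`, `lr`, `steps`, the two-microphone demo mixture) are hypothetical.

```python
import random


def filter_and_sum(mics, filters):
    """Filter each microphone signal with its own FIR filter and sum the results."""
    n = len(mics[0])
    out = [0.0] * n
    for x, h in zip(mics, filters):
        for t in range(n):
            acc = 0.0
            for k, hk in enumerate(h):
                if t - k >= 0:  # samples before the start are treated as zero
                    acc += hk * x[t - k]
            out[t] += acc
    return out


def mse(y, target):
    """Mean squared error between the beamformer output and the target."""
    return sum((a - b) ** 2 for a, b in zip(y, target)) / len(y)


def gradient_step(mics, filters, target, lr):
    """One gradient-descent update of every filter tap on the squared error."""
    n = len(target)
    y = filter_and_sum(mics, filters)
    err = [a - b for a, b in zip(y, target)]
    updated = []
    for x, h in zip(mics, filters):
        new_h = []
        for k, hk in enumerate(h):
            # d(mse)/d h[k] = (2/n) * sum_t err[t] * x[t - k]
            grad = 2.0 / n * sum(err[t] * x[t - k] for t in range(k, n))
            new_h.append(hk - lr * grad)
        updated.append(new_h)
    return updated


def fit_beamformer(mics, target, taps=4, lr=0.02, steps=300):
    """Start from all-zero filters and descend toward the fixed target."""
    filters = [[0.0] * taps for _ in mics]
    for _ in range(steps):
        filters = gradient_step(mics, filters, target, lr)
    return filters


def make_two_mic_demo(n=200, seed=0):
    """Hypothetical test mixture: one source seen through two short echo paths."""
    rng = random.Random(seed)
    source = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mic0 = [source[t] + (0.5 * source[t - 1] if t else 0.0) for t in range(n)]
    mic1 = [0.8 * source[t - 1] if t else 0.0 for t in range(n)]
    return source, [mic0, mic1]
```

Calling `fit_beamformer(mics, source)` on the output of `make_two_mic_demo()` yields filters whose summed output is closer to the source than the all-zero starting point. In the alternating scheme the abstract describes, the target would be re-derived from the recognizer state path after each such fit.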
Related resources
Special issue on speech separation and recognition in multisource environments
One of the chief difficulties of building distant-microphone speech recognition systems for use in 'everyday' applications is that the noise background is typically 'multisource'. A speech recognition system designed to operate in a family home, for example, must contend with competing noise from televisions and radios, children playing, vacuum cleaners, and outdoor noises from open windows. D...
NSF-CAREER: The Listening Machine Annual Report 2005
Continuing our broadened theme of machine listening in many contexts, in 2005 we conducted research into automatic extraction of information in complex sound mixtures, in 'personal audio' environmental recordings, from music audio, and for the sounds of marine mammals recorded underwater. 2005 saw the graduation of Manuel Reyes, the Ph.D. student supported by this project from the start. Manuel...
NSF-CAREER: The Listening Machine IIS-0238301 Annual Report 2007 Daniel
We have continued our research into associating words with the soundtracks of recordings of natural environments. We have been working with a database of 1400 “consumer videos” (collected by our collaborators at Kodak) as well as with similar amateur videos downloaded from YouTube. Based on a provisional lexicon of 25 terms that consumers might use as search terms (“music”, “birthday”, “beach”)...
Report from the BIT’s 1st Annual World Congress of Biomedical Engineering Held in Xi’an, China, 9–11 November 2017
We are delighted to present within this meeting report the abstracts of the "BIT's 1st World Congress of Biomedical Engineering 2017", which was held in Xi'an, China [...].
Pancreas Transplantation and Report of 1st one in IRAN
SUMMARY Since 1923, type I diabetic patients have been treated with injections of insulin. Mortality among these patients has decreased compared with patients not using insulin, but many of them have developed complications of diabetes mellitus, such as nephropathy, retinopathy, and neuropathy. The treatment of choice for this disease and for preventing its complications is pancreas transplantation. The 1st pancreas...